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HIGH ORDER VARIATIONAL INTEGRATORS IN THE OPTIMAL 
CONTROL OF MECHANICAL SYSTEMS 

CEDRIC M. CAMPOS, SINA OBER-BLOBAUM, AND EMMANUEL TRELAT 


Abstract. In recent years, much effort in designing numerical methods for the simulation and 
optimization of mechanical systems has been put into schemes which are structure preserving. 
One particular class are variational integrators which are momentum preserving and symplectic. 
In this article, we develop two high order variational integrators which distinguish themselves in 
the dimension of the underling space of approximation and we investigate their application to 
finite-dimensional optimal control problems posed with mechanical systems. The convergence 
of state and control variables of the approximated problem is shown. Furthermore, by analyzing 
the adjoint systems of the optimal control problem and its discretized counterpart, we prove 
that, for these particular integrators, dualization and discretization commute. 


1. Introduction 

In practice, solving an optimal control problem requires the a priori choice of a numerical 
method. Many approaches do exist, that are either based on a direct discretization of the optimal 
control problem, resulting into a nonlinear programming problem, or based on the application of 
the Pontryagin Maximum Principle, reducing into a boundary value problem. The first class of 
approaches are called direct, and the second ones, based on the preliminary use of the Pontryagin 
maximum principle, are called indirect. It can be noted that indirect methods, although very 
precise, suffer from an important drawback: Unless one already has a good knowledge of the 
optimal solution, they are very difficult to initialize since they are extremely sensitive to the 
initial guess. Although many solutions exist in order to carefully initialize a shooting method (see 
mm), in most of engineering applications direct approaches are preferred due to their simplicity 
and robustness. Roughly speaking, direct methods consist of 

(1) discretizing first the cost and the differential system, in order to reduce the optimal control 
problem to a usual nonlinear minimization problem with constraints, also called nonlinear 
programming problem (NLP), with dimension inversely proportional to the smoothness 
of the discretization; 

(2) and then dualizing, by applying for instance a Lagrange-Newton method to the NLP, 
deriving the Karush-Kuhn-Tucker equations (KKT), also called discrete adjoint system, 
and applying a Newton method to solve the resulting optimality system. 

Many variants exist, e.g. la). In contrast, indirect methods consist of 

(1) first dualizing the optimal control problem to derive the adjoint system, by applying 
the Pontryagin Maximum Principle (PMP) (or, equivalently, the Lagrange multipliers 
necessary condition for optimality in infinite dimension), 

(2) and then discretizing, by applying a shooting method (that is, a Newton method composed 
with a numerical integration method). 

In shorter words, direct methods consist of 1) discretize, 2) dualize, and indirect methods consist of 
the converse: 1) dualize, 2) discretize. It is natural to wonder whether this diagram is commutative 
or not, under usual approximation assumptions. 

Since the pioneering works of mm, it is well known by now that, in spite of usual assumptions 
ensuring consistency and stability, direct methods may diverge. In other words discretization and 
dualization do not commute in general. This is due to a complex interplay with the mesh, to the 


1991 Mathematics Subject Classification. Primary 65P10, Secondary 65L06, 65K10, 49Mxx. 

Key words and phrases, optimal control, mechanical systems, geometric integration, variational integrator, high 
order, Runge-Kutta, direct methods, commutation property. 


1 



2 


C. M. CAMPOS. S. OBER-BLOBAUM, AND E. TRELAT 


appearance of spurious highfrequencies ultimately causing the divergence (see |17| for very simple 
finite-dimensional linear quadratic problems and |58| for infinite-dimensional wave equations). 

Several remedies and solutions exist, from which m ng na sg are a representative sample. 
For instance, the results of HZ] assert the convergence of optimal control problems under specific 
smoothness and coercivity assumptions provided that the underlying discretization method be 
based on a Runge-Kutta method. However, the convergence order of the optimal control solution, 
which depends on the convergence order of the state and the resulting adjoint scheme, is reduced 
compared to the order of the Runge-Kutta method applied to the state system. Indeed, the 
discrete state and adjoint system constitutes a symplectic partitioned Runge-Kutta method for 
which order conditions on the Runge-Kutta coefficients are derived to preserve the convergence 
rates. Whereas in m the symplectic partitioned Runge-Kutta scheme for state and adjoint is 
explicitly derived, in a recent overview article |44| a proof is given based on a relation between 
quadratic invariants (that are also naturally preserved by symplectic partitioned Runge-Kutta 
integrators) and the fulfillment of the KKT equations. The preservation of convergence rates is 
referred to as the Covector Mapping Principle (CMP) (see e.g. [H]). More precisely, the CMP 
is satisfied if there exists an order-preserving map between the adjoint variables corresponding to 
the dualized discrete problem (KKT) and the discretized dual problem (discretized PMP). For the 
class of Legendre pseudospectral methods the CMP is proven if additional closure conditions are 
satisfied (see |16l I48| L whereas for Runge-Kutta methods the CMP holds if the order conditions 
on the Runge-Kutta coefficients derived in HZ] are satisfied. For a detailed discussion on the 
commutation properties we refer to |42| . 

While for general dynamical systems, many studies of discretizations of optimal control prob¬ 
lems are based on Runge-Kutta methods (see e.g. |12l[Tin[TRll^[^ l. particularly for mechanical 
systems, much effort in designing numerical methods for the simulation and optimization of such 
systems has been put into schemes which are structure preserving in the sense that important 
qualitative features of the original dynamics are preserved in its discretization (for an overview on 
structure preserving integration methods see e.g. HSI). One special class of structure preserving 
integrators is the class of variational integrators, introduced in isgiig, which are symplectic and 
momentum-preserving and have an excellent long-time energy behavior. Variational integrators 
are based on a discrete variational formulation of the underlying system, e.g. based on a discrete 
version of Hamilton’s principle or of the Lagrange-d’Alembert principle for conservative 150] 
or dissipative [H] mechanical systems, respectively. They have been further extended to different 
kind of systems and applications, e.g. towards constrained HI1I2S1EII, non smooth HU, multirate 
and multiscale |82l HH] , Lagrangian PDF systems [3S] and electric circuits m- In the cited 
works, typically quadrature rules of first or second order are used in order to approximate the 
action functional of the system. To design high order variational integrators, higher order quadra¬ 
ture rules based on polynomial collocation can be employed. Such so called Galerkin variational 
integrators have been introduced in |36| and further studied in nnoiiiziissiiini- 

For numerically solving optimal control problems by means of a direct method, the use of 
variational integrators for the discretization of the problem has been proposed in iHiisniini]. This 
approach, denoted by DMOC (Discrete Mechanics and Optimal Control), yields a finite-difference 
type discretization of the dynamical constraints of the problem which by construction preserves 
important structural properties of the system, like the evolution of momentum maps associated 
to symmetries of the Lagrangian or the energy behavior. For one class of Galerkin variational 
integrators, that is equivalent to the class of symplectic partitioned Runge-Kutta methods, the 
adjoint system and its convergence rates are analyzed in [39]. It is shown that, in contrast 
to a discretization based on standard Runge-Kutta methods in HZ], the convergence order of 
the discrete adjoint system is the same as for the state system due to the symplecticity of the 
discretization scheme. In particular, we obtain the same symplectic-momentum scheme for both 
state and adjoint system, that means that discretization and dualization commute for this class 
of symplectic schemes and the GMP is satisfied. For general classes of variational integrators, the 
commutation property is still an open question. The contribution of this work is twofold: 


(1) We derive two different kinds of high order variational integrators based on different dimen¬ 
sions of the underling polynomial approximation (Section . While the first well-known 
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integrator is equivalent to a symplectic partitioned Runge-Kutta method, the second inte¬ 
grator, denoted as symplectic Galerkin integrator, yields a “new” method which in general, 
cannot be written as a standard symplectic Runge-Kutta scheme. 

(2) For the application of high order variational integrators to finite-dimensional optimal 
control problems posed with mechanical systems, we show the convergence of state and 
control variables and prove the commutation of discretization and dualization (Sections]^ 
and[^. 

The paper is structured as follows: In Section the optimal control problem for a mechanical 
system is introduced. Its discrete version is formulated in Section based on the derivation 
of two different kinds of high order variational integrators. The first main result is stated in 
Section Under specific assumptions on the problem setting we prove the convergence of the 
primal variables for an appropriate choice of the discrete controls (Theorem |4.1| . Along the lines 
of |18| , we demonstrate the influence of the control discretization on the convergence behavior by 
means of several examples. In Section the discrete adjoint system for the symplectic Galerkin 
method is derived. It turns out that the reformulation of the variational scheme in Section [3] 
simplifies the analysis. Whereas commutation of discretization and dualization for symplectic 
partitioned Runge-Kutta methods has already been shown in |39] . we prove the same commutation 
property for the symplectic Galerkin discretization (Theorem |5.2[ ), which is the second main result 
of this work. In contrast to the discretization with Legendre pseudospectral methods or classical 
Runge-Kutta methods, no additional closure conditions (see (TB]) or conditions on the Runge- 
Kutta coefficients (see [IT]) are required to satisfy the GMP, respectively. Furthermore, using the 
high order variational integrators presented here, not only the order but also the discretization 
scheme itself is preserved, i.e. one yields the same schemes for the state and the adjoint system. 
We conclude with a summary of the results and an outlook for future work in Section 

2. Optimal control for mechanical systems 


2.1. Lagrangian dynamics. One of the main subjects of Geometric Mechanics is the study 
of dynamical systems governed by a Lagrangian. Typically, one considers a mechanical system 
with configuration manifold Q C K" together with a Lagrangian function L: TQ —>■ K, where 
the associated state space TQ describes the position q and velocity g of a particle moving in 
the system. Usually, the Lagrangian takes the form of kinetic minus potential energy, L{q, q) = 
K{q,q) — V{q) — \ q^ ■ M{q) ■ q — V{q), for some (positive definite) mass matrix M{q). 

A consequence of the principle of least action, also known as Hamilton’s principle, establishes 
that the natural motions q: [0,T] —>■ Q of the system are characterized by stationary solutions of 
the action, thus, motions satisfying 

(1) 6 [ L{q{t),q{t))dt = 0 

Jo 

for zero initial and final variations (5g(0) = 5q{T) = 0. The resulting equations of motion are the 
Euler-Lagrange equations (refer to |T]), 


( 2 ) 


^ 

dt dq dq 


When the Lagrangian is regular, that is when the velocity Hessian matrix d^L/dq^ is non¬ 
degenerate, the Lagrangian induces a well defined map, the Lagrangian flow, Ff: TQ —^ TQ 
by Fl{qQ,qo) ■= {q{t),q{t)), where q G C^([0,T],Q) is the unique solution of the Euler-Lagrange 
equation Q with initial condition (go, go) S TQ. By means of the Legendre transform legj;^ : 
(g,g) € TQ i—)■ {q,p = ^\{q,q)) G T*Q, where T*Q is the phase space of positions g and momenta 
p, one may transform the Lagrangian flow into the Hamiltonian flow Fh-- T*Q ^ T*Q defined 
by Flj{qo,po) := legi(g(t),g(t)). 

Moreover, different preservation laws are present in these systems. For instance the Hamiltonian 
flow preserves the natural symplectic structure of T*Q and the total energy of the system, typically 
FJ{q,p) = K{q,p) -I- U(g) = ■ M{q)~^ ' p — VIq) (here K still denotes the kinetic energy, but 

depending on p rather than on g). Also, if the Lagrangian possess Lie group symmetries, then 
Noether’s theorem asserts that the associated momentum maps are conserved, like for instance 
the linear momentum and/or the angular momentum. 
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If external (non conservative) forces F: {q,q) G TQ i—>■ {q,F{q,q)) G T*Q are present in the 
system, Hamilton’s principle Q is replaced by the Lagrange-d’Alembert principle seeking for 
curves that satisfy the relation 

L{q,q)(ltF f F{q,q) ■ 6qdt = 0 

Jo 



for zero boundary variations ^^(O) = Sq{T) = 0, where the second term is denoted as virtual work. 
This principle leads to the forced Euler-Lagrange equations 


( 4 ) 


d dL dL 
dt dq dq 


F{q,q). 


The forced version of Noether’s theorem (see e.g. [ 33 ]) states that if the force acts orthogonal to 
the symmetry action, then momentum maps are still preserved by the flow. Otherwise, the change 
in momentum maps and energy is determined by the amount of forcing in the system. 


2.2. Optimal control problem. Since we are interested in optimally controlling Lagrangian 
systems, we assume that the mechanical system may be driven by means of some time dependent 
control parameter m : [0, T] —>■ {7 with U C K™ being the control set. Typically, the control 
appears as an extra variable in the external force such that in the following we consider forces of 
the form F : TQ x U ^ T* Q and we replace the right-hand side of Q by the control dependent 
force term F(q, q, u). 

An optimal control problem for a mechanical system reads (also denoted as Lagranigan optimal 
control problem in |39p 


Problem 2.1 (Lagrangian optimal control problem (LOCP)). 


(5a) 

min J(g, q, u) = 

<1,4,u J 

f C{q{t), q{t),u{t)) dt + ‘^{q{T),q{T)) 

0 

subject to 



(5b) 

S [ L{q{t),q{t))dt + 
Jo 

[ ■ Sq{t)dt = 0, 

Jo 

(5c) 

(g(0),q(0)) = (g°,9°). 


with minimization over q G T], Q) = W'^'°°{[0,T],Q), q G W^'°°{[0,T],TqQ) and u G 

L°°{[0,T],U). The interval length T may either be fixed, or appear as degree of freedom in 
the optimization problem. Since any optimal control problem with free final time can easily be 
transformed into a problem with fixed final time (see e.g. |15|). we assume the time T to be fixed 
from now on. The control set U C ffi™ is assumed to be closed and convex, and the density cost 
function C: TQ xU i—>■ R and the final cost function $: TQ i—>■ M” are continuously differentiable, 
being $ moreover bounded from below. 


Henceforth we should assume that the Lagrangian is regular, i.e. there is a (local) one-to-one 
correspondence between the velocity q and the momentum p via the Legendre transform and its 


inverse 


and 


(9,P)- 


Thus, the forced Euler-Lagrange equations @ can be transformed into the partitioned system 

(6) qit) = f{q(t),p{t )), p{t) = g{q(t),p{t),u{t)) 

with 


( 7 ) 


f{TP) = 


dL 

dq 


dL 

{q,p) and g{q,p,u) = —{qj{q,p)) + F {qj{q,p),u) . 


With some abuse of notation we denote the force and the cost functions defined on T*Q x U 
and T*Q, respectively, by F{q,p,u) := F{q, f{q,p),u), C{q,p,u) := C{q, f{q,p),u) and <^{q,p) := 


^{q, f{q,p)) such that Problem 2.1 can be formulated as an optimal control problem for the 
partitioned system (§. 
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Problem 2.2 (Optimal control problem (OOP)). 

(8a) min J{q,p,u) = [ C{q{t),p{t),u{t)) dt + <^{q{T),p{T)) 

g,p,u Jo 

subject to 

(8b) q{t) = f{qit),p{t)), g(0) = g°, 

(8c) p{t) = g{q(t),p{t), u{t )), p{0) = /, 

with minimization over g e W^’°°{[0,T],Q), p G W^'°°{[Q,T],T*Q) and u S L°°([0, T], t7) and the 
functions /: T*Q ^ M", g:T*Q xU ^ K" are assumed to be Lipschitz continuous. 


The first order necessary optimality conditions can be derived by means of the Hamiltonian for 
the optimal control problem given by 

(9) n{q,p,u,X,il},po) = poC{q,p,u) + A • f{q,p) + V'' g{q,P,u) 

with po G K and A and tp are covectors in K”. 


Theorem 2.3 (Minimum Principle, e.g. [H]). Let {q*,p*, u*) € tT^’°°([0, T], Q) xlT^’°°([0, T], T*.Q)x 
L°°([0,T], U) be an optimal solution to Problem 2.2 Then there exist funetions A € VP^’'^([0,T],R”) 
and Ip G lT^’°°([0,r],IR”) and a constant po > 0 satisfying {po,X,'ip) ^ (0,0,0) for all t G [0,T] 
such that 


(10a) n{q*{t),p*{t),u*{t),X{t),tp{t),po) = minn{q{t),p{t),u,X{t),tp{t),po) , 

u£U 

for t G [0,T], and (poj A,'0) solves the following initial value problem: 

(10b) X = -V,'H{q\p\u*,X,^P,po), A(T)=poV,$(g*(T),p*(r)), 

(10c) ^ = -VpH{q*,p\u*,X,P;,po), P;{T)=poVp^q*{T),p*{T)). 


The vectors X{t) and ipit) are the costate or the adjoint variables of the Hamiltonian equations 
of optimal control. The scalar po is called the abnormal multiplier. In the abnormal case, it holds 
Po = 0, and otherwise the multiplier can be normalized to po = 1- Since no final constraint on 
the state is present in the optimal control problem, the above principle holds true with po = 1 (as 
proved for instance in m)- 


Remark 2.4. If g is affine w.r.t. u, then the topologies can be taken as for the controls, 
on q and on p, and the PMP would still be valid for these classes. Besides, 

optimal control problems where the optimal control is in but not in are very seldom. For 
instance, if one is able to express u in function of (q,p, A, ip), as for the assumptions of Theorem 


5.2 then u is clearly in L° 


3. Discretization 


Since we are interested in solving optimal control problems by some kind of direct method, a 
discrete approximation of Problem |2.2| is required. To this end, we first introduce two different 


variational integrators that we employ for the approximation of the control system given in (8b I- 
(8cl. Based on these discrete schemes, we derive the discrete approximations of the optimal 
control problem that can be solved by standard numerical optimization methods. The controls 
play no role in the derivations of the variational integrators, therefore we will omit temporarily 
the dependence of the external force F on u, which will lighten the notation. The discrete schemes 
including the approximation of the controls are given in Section |3.3[ 


3.1. Discrete Mechanics and Variational Integrators. Discrete Mechanics is, roughly speak¬ 
ing, a discretization of Geometric Mechanics theory. As a result, one obtains a set of discrete 
equations corresponding to the Euler-Lagrange equation Q above but, instead of a direct dis¬ 
cretization of the ODE, the latter are derived from a discretization of the base objects of the theory, 
the state space TQ, the Lagrangian L, etc. In fact, one seeks for a sequence {(toi^o)! (^ij9i)i ■ ■ ■ i 
(tAfi^Ai)} that approximates the actual trajectory q{t) of the system (qk ~ q{tk)), for a constant 
time-step h = > 0. 
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A variational integrator is an iterative rule that outputs this sequence and it is derived in an 
analogous manner to the continuous framework. Given a discrete Lagrangian : Q x Q —>■ K and 
discrete forces : Q x Q ^ T*Q, which are in principle thought to approximate the continuous 
Lagrangian action and the virtual work, respectively, over a short time 

rtk+1 

(lla) Ld{qk,qk+i)^ L{q{t),q{t))dt, 


(11b) 


Fd iqk,qk+i) ■ Sqk + Fj;{qk,qk+i) ■ Sqk+i 


'tk 


F{q{t)A{F)) ■ 5q{t)At, 


'tk 


one applies a variational principle to derive the well-known forced discrete Euler-Lagrange (DEL) 
equation, 

(12) DiLd{qk,qk+i) + D 2 Ld{qk-i,qk) + F~{qk,qk+i) + F^{qk-i,qk) = 0, 

for fc = 1,..., fV — 1, where Di stands for the partial derivative with respect to the Ath component. 
The equation defines an integration rule of the type {qk-i, qk) (qk, qk+i), however if we define 
the pre- and post-momenta (also denoted as discrete Legendre transforms) 

(13a) p'^ :=-DiLd{qk,qk+i) - F)f {qk,qk+i), k = 0,...,N-l, and 

(13b) := D2Ld(<?fc-i,<?fc)+F+((7fc_i,gfc), k = l,...,N, 

the discrete Euler-Lagrange equation (121 is read as the momentum matching pf = p^ =: pk and 


defines an integration rule of the type {qk,Pk) t iqk+i,Pk+i)- 

The nice part of the story is that the integrators derived in this way naturally preserve (or nearly 
preserve) the quantities that are preserved in the continuous framework, the symplectic form, the 
total energy (for conservative systems) and, in presence of symmetries, the linear and/or angular 
momentum (for more details, see |36|L Furthermore, other aspects of the continuous theory can 
be “easily” adapted, symmetry reduction [3 HOI mi, constraints [251 [25] , control forces El [Si , etc. 


3.2. High Order Variational Integrators. High order variational integrators for time depen¬ 
dent or independent systems (HOVI[t]) are a class of integrators that, by using a multi-stage 
approach, aim at a high order accuracy on the computation of the natural trajectories of a 
mechanical system while preserving some intrinsic properties of such systems. In particular, 
symplectic-partitioned Runge-Kutta methods (spRK) and, what we call here, symplectic Galerkin 
methods (sG) are s-stage variational integrators of order up to 2s. 

The derivation of these methods follows the general scheme that comes next, the specifics of each 
particular case are detailed in the following subsections. For a fixed time step h, one considers 
a series of points qk, refereed as macro-nodes. Between each couple of macro-nodes {qk,qk+i), 
one also considers a set of micro-data, the s stages: For the particular cases of sG and spRK 
methods, we consider micro-nodes Qi,... ,Qs and micro-velocities Qi,... ,Qs, respectively. Both 
macro-nodes and micro-data (micro-nodes or micro-velocities) are required to satisfy a variational 
principle, giving rise to a set of equations, which properly combined, define the final integrator. 

Here and after, we will use the following notation: Let 0 < ci < ... < Cs < 1 denote a set of 
collocation points and consider the associated Lagrange polynomials and nodal weights, that is, 

V{t)-.= \\— — — and bj := [ P {t)dt, 

A Cj - Ci Jo 


respectively. Note that the pair of (ci, &i)’s define a quadrature rule and that, for appropriate c^’s, 
this rule may be a Gaussian-like quadrature, for instance, Gauss-Legendre, Gauss-Lobatto, Radau 
or Ghebyshev. 

Now, for the sake of simplicity and independently of the method, we will use the same notation 
for the nodal coefficients. We define for spRK and sG, respectively. 


(14) 


OjH ■ — 


P{t)dt 


and 


CLij .— 


dP 

dt 


Moreover, for spRK, we will also use the nodal weights and coefficients {bj,aij) given by Equation 
(221 and, for sG, the source and target coefficients 


a^:=P{0) and /3^:=F(1). 
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Finally, if L denotes a Lagrangian from M" x K" to M coupled with an external force F: {q, q) G 
X K” i-A {q, F{q, q)) G K" x M", then we define 

dq {Qi,Qi) 

where {Qi, Qi) are couples of micro-nodes and micro-velocities given by each method. Besides, Di 
will stand for the partial derivative with respect to the i-th component. 




and 


p. .= 9L 

* ■ dq 


-F=^ 
" dq 


(0..Q.) 


+ F(Qi, Qi) 


3.2.1. Symplectic-Partitioned Runge-Kutta Methods. Although the variational derivation of spRK 
methods in the framework of Geometric Mechanics is an already known fact (see [33] for an 
“intrinsic” derivation, as the current, or m for a “constrained” one), both based on the original 
works of miiisj, we present it here again in order to ease the understanding of and the comparison 
with sG methods below. 

Given a point go G ^-nd vectors C K", we define the polynomial curves 


We have 

(15) 


C(t) := E P(t/h)Qj and 
i=i 


Qit) ■■= 


qo 




otjh 


ippdrQj 


Qi — Q)h * Ci) and Qi •— Q(^h * Ci) — qQ -1- h aijQj . 

i=i 


Note that the polynomial curve Q is uniquely determined by go and In fact, it is the 

unique polynomial curve Q of degree s such that Q(0) = go and Q{h ■ Ci) = Qi. However, if we 
define the configuration point 

S 

(16) gi := Q{h • 1) = go -f /i ^ bjQj 

i=i 


and consider it fixed, then Q is uniquely determined by go, gi and the Qi’s but one. Namely, take 
any 1 < io < s such that bi„ ^ 0 and fix it, then 


Qio 



E /K 

jAio / 


We now define the multi-vector discrete Lagrangian 


(17) Ld(Qi,...,4) :=/i^6,L(Q„4) 

2=1 


and the multi-vector discrete force 

s 

Fd{Qi,.. ..Qs) ■ {bQi,.. .,SQs) ■= h'^b^F{Qi,Qi) ■ 6Qi. 

i=l 


Although not explicitly stated, they both depend also on go. If we write the micro-node variations 
SQi in terms of the micro-velocity variations SQi (by definition ( |15[ )), we have that the multi-vector 
discrete force is 

S S 

Fd{Qi,..., Qs) ■ {5Qi,..., SQs) = EE biaijF(^Qi^ Qi) • 6Qj . 

1=1i=l 

The two-point discrete Lagrangian is then 

(18) Z/^(go, gi) . ext LdiQii • • • tQs) 

'P‘‘{lO,h],R’',qo,qi) 


where 7^®([0,/i], K", go, gi) is the space of polynomials Q of order s from [0,/i] to K” such that 
Q(0) = go and Q{h) — gi and the vectors Qi’s determine such polynomials as discussed above. 
The so called “extremal” is realized by a polynomial Q G 'P'’([0,/i], M", go, gi) such that 

(19) SLdiQi ,..., Qs) • {SQh • • • j ^Qs) + FdiQij ■ • •) Qs) • (<5Qi) • ■ •, SQs) = 0 
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for any variations {SQi,... ,SQs), taking into account that Sqo = Sqi = 0 and that SQi 
9Qio/dQjSQj. For convenience, the previous equation (191 is developed afterwards. 
The two-point discrete forward and backward forces are then 

(20) F^{qo,qi) ■ S{qo,qi) := h'^biF{Q„Qi) ■ ^^6q± , 

where q- = qq and = qi. Using the previous relations, we may write 

s s 

F^ — ^ ^ ( ^i(l IF-i and F^ — ^ ^ ( b^Ojn^/b^^Fi. 


By the momenta-matching rule ( |13[ ), we have that 

Po — ■ t Qs')/[hbif^') F , Qs) F F^ , 

Pi = DiqLdiQi,...,Qs)/{hbi„)FF;l. 

where Dq^ stands for the partial derivative with respect to qg. Combining both equations, 
obtain that 

S S 

FiqLd “t" h ^ ) bi(iiiqF 2 — hbiqPi and pi — pq -t- Dq^Ld -i- ^ ) b^F^ . 


Coming back to Equation ( |19[ ), we have that 

0 = SLdiQi,... ,Qs) ■ (SQi,... ,dQs) F Fd{Qi,... ,Qs) ■ [SQi,... ,dQs) 

= I DjLd -k biaijFi F ( Di^Ld -k biUu^Fi j j 5Qj . 

j^io \ i=l 

Therefore, for j ^ io, we have that 


dQ 


FjLd “k h ^ ) b^ciijFi bjjbiq ■ I D^^Ld F h ^ ) b^dn^Fi j . 

Z = 1 \ i=l / 

Thus, the integrator is defined by 

S 

(21a) DjLdi^Qi^ ■ ■ ■, Qs) F h ^ ) biciijF^ — hbjpi , 


(21b) 


qi = qo Fh'^bjQj , 


(21c) pi=pqF DqqLdiQi, ...^Qs) Fh'^biFi. 

i=l 

Besides, using the definition of the discrete Lagrangian, we have 

S S 

DjLdiQi, ■. ■ ,Qs) F h'^'^bidijFi = h’^'^bidijPi F hbjPj , 


1 = 1 


i=l 


DqqLd{Qi,...,Qs)Fh'^biFi = h bjPj. 

i=l i=l 

Therefore, we may write 

s s 

Pj — Pq h ^ ^ ^ 2(1 /^j^Pi — PO T ^ ^ ^ OjjiPi , 

i=l i=l 

s s 

Pi=PoPh^ biPi = po + ^^ hiPi , 


i=l 


i=l 


were aij and bi are given by 

( 22 ) 


biQij + bjaji — bibj , bi — bi. 
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In summary, the equations that define the spRK integrator (with forces), are together with (22) 


(23a) 

qi =qo + : 

Pi =Po + hY^jPj ’ 




(23b) 

Qi — qo T ^ ^ ^ eiijQj , 

Pi —PO F h ^ ( CLij Pj , 


i=i 


(23c) 

7 ” 

Pi =-^{Qi, Qi), 

Pi =-^(Qi! Qi) + F{Q. 


3.2.2. Symplectic Galerkin Methods. Galerkin methods are a class of methods to transform a 
problem given by a continuous operator (such as a differential operator) to a discrete problem. As 
such, spRK methods fall into the scope of this technique and could be also classified as “symplectic 
Galerkin” methods. However, we want to stress on the difference between what is called spRK in 
the literature and what we refer here as sG. The wording should not be confused by the one used 
in 135]. 

Given points C K”, we define the polynomial curves 

S 1 ^ 

Q{t) := Y, and Q{t) := v ^ P{t/h)Qj . 




t=i 


We have 


1 ® 

Qi — Q(yh * Cj) and .— Q(^h * Cj) — ^ ^ ^ o.ijQj . 


i=i 


Note that the polynomial curve Q is uniquely determined by the points In fact, it is 

the unique polynomial curve Q of degree s — 1 such that Q{h ■ Ci) = Qi. However, if we define the 
configuration points 


(24) 


qo ■= Q{h ■ 0) =Y'^^Q 3 gi := Q{h ■ 1 ) = 

i=i i=i 

and consider them fixed, then Q is uniquely determined by qo, qi and the Qi’s but a couple. For 
instance, we may consider Qi and Qs as functions of the others, since the relations (24) define a 
system of linear equations where the coefficient matrix has determinant 7 := a^/3® — ^ 0 (if 

and only if ci ^ Cg). More precisely. 


Qi 

Qs 


qo - Ej=2 
9i - Ei=2 


We now define the multi-point discrete Lagrangian 

S 

(25) Ld{Qu...,Qs)-=hYhL{Q,,Qi) 
and the multi-vector discrete force 

s 

Fd{Qi,.. .,Qs) ■ {5Qi,.. .,5Qs) ■= h'YhF{Qi,Qi) ■ 5Qi. 

The two-point discrete Lagrangian is then 

(26) Ld{qo,qi) ■■= ext Ld{Qi,.. . ,Qs) 

'P‘’(lO,h],R‘",qo,qi) 

where ^*([0, h], K", qo, qi) is the space of polynomials Q of order s from [0, h] to M" such that the 
points Qi’s determine such polynomials as discussed above. The so called “extremal” is realized 
by a polynomial Q G 'P®([0, h], K", goi 9i) such that 

(27) SLdiQi, ...,Qs)- {SQi,..., SQs) + Fd{Qi, ...,Qs)- {SQi,..., SQs) = 0 

for any variations { 6 Q 1 ,... ,dQs), taking into account that Sqo = Sqi = 0 and that SQi = 
Si =2 dQi/dQjSQj, i = l,s. For convenience, the previous equation (271 is developed afterwards. 
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The two-point discrete forward and backward forces are then formally defined by Equation 


(20l. Using the previous relations, we may write 

= HPshFi - l5ihsFs)h and F+ = h{aibsFs - asbiFi)/^ . 
By the momenta-matching rule ( |13[ ), we have that 

-Po = h ■ {DiLd + hbiFi) - /^ ■ {DsLd + hbsFs) and 
Pi = -a'^h ■ {DiLd + hbiFi) + aV7 ’ {DsLd + hb^F^). 

By a linear transformation of both equations, we obtain 

DiLdiQi, ■■■, Qs) + hbiFi = -a^po + P^pi and 
DsLdiQi, ...,Qs) + hbsFs = -a^po + P’^Pi ■ 

Coming back to Equation ( [2^ , we have that 

0 = {5Ld{Qi, ...,Qs) + Fd{Qi ,..., Qs)) ■ (<5Qi, - ■ •, 5Qs) 


= E 

1=2 


{DiLd + hbiFi) -I- {DjLd -|- hbjFj) + (DgLd + hbgFg) ^ 


SQ 


3 ■ 


Therefore, for j = 2,..., s — 1, we obtain 

'y{DjLd + hbjFj) = {a^ P""-a’’P^){DiLd + hbiFi) + {a'^ P^-a^ P^){DsLd + hb^Fg) 
= {a^P‘‘-a’^P^){P^pi-a^po). 

Thus, the integrator is defined by 

(28a) DjLdiQi, ■■■, Qs) + hbjFj = -a^po + P^pi , j = 1,..., s; 

S S 

(28b) qo = '^a^Qj and gi = P^Qj 

1=1 1=1 

Besides, using the definition of the discrete Lagrangian, we have 


dL 


DjLd{Qi,...,Qs) = h^bt^ — 



dL 

idQj 

_ _ 

dq 


FjLdi^Qi^ ■ • ■ 5 Qs) F hbjFj hbjPj -t- ^ ) bj^d^jP^. 

1=1 


dQ^ 

^dQj 


Therefore, we may simply write 

S 

hbjPj + b^aijPi = -a^po + P^pi. 

1=1 

In summary and for a proper comparison, we write the equations that define the sG integrator 
(with forces) in a pRK way, that is 


(29a) 

(29b) 


<70 


= J2a^Qj, 


1=1 
1 


q,=J 2 p^Q^, 

1=1 


Qi — 7 'y ( ttijQj ) 


p. = 


Tpi - a >0 , 1 


1=1 


hbi 


L E “* 1^1 ’ 


1=1 


dL 

Pr =-^{Q^,Qi), 


dL, 


P^=^{Q^.Qi) + F{Q,,Q^), 


(29c) 

where bidij + bjdji = 0 and bi = bj. 

We remark that Equation (28a I generalizes the ones obtained in HiEg, where the collocation 
points are chosen such that ci = 0 and Cg = 1, which is a rather particular case. 
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3.2.3. Similarities and differences between spRK and sG. As already mentioned, both methods 
can be considered of Galerkin type. In this sense, spRK and sG could be refereed as a symplectic 
Galerkin integrators of 1st and 0th kind, respectively, since spRK is derived from the 1st derivative 
of an extremal polynomial and sG from the polynomial itself. At this point, a very natural 
question could arise: Are spRK and sG actually two different integrator schemes? Even though 
the derivations of both methods are quite similar, they are in general different (although they 
could coincide for particular choices of the Lagrangian, the collocation points and the integral 
quadrature). A weak but still fair argument to support this is that, at each step, spRK relies on 
the determination of the micro-velocities Qii while sG does so on the micro-nodes Qi. All the 
other “unknowns” are then computed from the determined micro-data. 

In the simplest of the cases, that is, the case where one considers a Lagrangian of the form 
kinetic minus potential energy, L{q, q) = Mq — U{q), with M a constant mass matrix; s = 2 
micro-nodes (inner-stages); and Lobatto’s quadrature, ci = 0, C 2 = 1; one may show that both 
schemes, spRK (231 and sG (291, reduce to the well-known leap-frog or Verlet method. They will 
differ when the previous main assumptions are violated, for instance if M is not constant or the 
quadrature is other than Lobatto’s. 


Example 3.1. We consider a Lagrangian with a scalar mass matrix dependent on the configuration, 
that is, a Lagrangian of the form L{q,q) = iA(g)||g|p — V{q), with A: Q —>■ K. Under this 
assumption and noting A 1/2 := (V)Ai := (V)A(( 7 i), (V)U := (V)U(gi), i = 0,1, the spRK 

scheme (231 as well as the the sG scheme (|2^ reduce to 


Pl/2 =P0+ ^ 


= 90 + 2 


Pi =Pl/2 + 



with a slight difference in the subindexes appearing in the framed factors. While in the spRK 
scheme, a = 0 and b = 1; in the sG scheme a = b = l/2. It is important to note that, even though 
the difference is small, it makes both schemes certainly different. Besides notice that the first two 
equations define pi /2 and qi implicitly and that the whole set reduces to the Verlet method for a 
constant A. Indeed, it is shown in IS1I3S] that for a Lagrangian with constant mass matrix and 
Lobatto quadrature rule, the sG and the spRK method coincide. 


3.2.4. Order of the schemes. With respect to the accuracy of the schemes, for any Gaussian-like 
quadrature (Gauss-Legendre, Gauss-Lobatto, Radau and Ghebyshev) and any method (spRK and 
sG), the schemes have a convergence order up to 2s (which is only attained by the combination 
Gauss-Lobatto together with spRK) but no lower than 2s—2, being s the number of internal stages, 
see Table ^ We emphasize that these orders have been determined numerically experimenting 
with several “toy examples” for which exact solutions are known, e.g. the harmonic oscillator and 
the 2 body problem (see Section]^, however they coincide with the known analytic results when 
available, that is spRK together with the Gauss-Legendre or Gauss-Lobatto quadratures (see e.g. 

m)- 


3 . 3 . Discrete optimal control problem. For the discretization of the optimal control prob¬ 
lem |2.1[ we employ the class of high order variational integrators. By choosing an appropriate 
approximation Jd of the cost functional J, the general discrete Lagrangian optimal control problem 
as discretization of the Langrangian optimal control problem |2.1| reads (see also |39p 

Problem 3.2 (Discrete Lagrangian optimal control problem). 

(30a) min Jd{{qk,Uk}k=o) 

{qk,uk}"^o 
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spRK 

sG 

micro-data 

Qv 

Q^ 

polynomial degree 

s 

5—1 

variational eq.’s 

s -1- 1 

S 

extra equations 

1 

2 

quadrature 

Gauss-Legendre 

2 s 

2 s-2 

Gauss-Lobatto 

2 s-2 

2 s-2 

Radau 

2 s- 1 

2 s-2 

Ghebyshev 

2 s-2 

2 s-2 


order method 


Table 1. Comparison of s-stage variational integrators. 


subject to 

(30b) go = 

(30c) D 2 L{q^ + DiLd{qQ,qi) + Fq =0, 

(30d) D 2 Ld{qk-i,qk) + DiLd{qk, Qk+i) + Fj^{qk-i,qk) + (gfe, gfc+i) = 0, 


for /c = — 1 and where Equation (30dl is the forced discrete Euler-Lagrange equation 

defined in (121 and Equations (30bl-(30cl correspond to the initial condition (5c I expressed by 
means of the discrete Legendre transform (13). Here, the control trajectory u is approximated 

u{tk). Note that for the controlled case, 


the F^ are dependent on Uk- To specify the discrete optimal control problem, in particular, the 


by the discrete values Uk, k — 0, ... such that Uk 

d 

approximation of the control parameter u and the cost functional J, we focus on the high order 
variational integrators derived in Section |3.2[ namely the spRK and the sG method, and find a 
discrete version of the more general optimal control problem |2.2| 


As for the approximation of q(t) and g(t), we also use a polynomial for the approximation of 
the control function u{t) on [0, h\. For a given set of collocation points 0 < ci < ... < Cg < 1 and 
given control points G U we define the polynomial of degree s — 1 


U{t) :=J2l\t/h)U, 


i=i 


such that Ui = U(h ■ Ci), i = 1,..., s. Note that the control polynomial h{{t) has the same degree 
as the polynomial Q{t) for the sG integrator, whereas for the spRK integrator it coincides with the 
polynomial degree of Q{t). To take in consideration the control dependent force in the previous 
derivation of the spRK and the sG schemes into account, we replace in the definitions for Ft in 


Equation (201 the external force Fi — F{Qi, Qi) by Fi = F{Qi,Qi, Ui). Furthermore, for a regular 
Lagrangian and by using the definitions for / and g in ([^, we can write Equations (23c I and (29c) 


as 


f{Q^,P^) = 


dL 

dq 


-1 


dL 

{Q^,P^), g{Q^,P^,U,) = -^{Q^,Q^) + F{Q,,Q,,Ui) 


such that the spRK scheme can be written as 


(31a) 


Qi 


Q^ 


=qo + h'^bjf{Qj,Pj) , 

S 

—Qo + h 'y ( ctij f {Qj t Pj )) 


Pi =po + h'^ bjg{Qj,Pj, Uj ), 

i=i 

S 

Pi ~Po “f k y ) (iij g{Qj., Pj , Uj ), 
i=i 


(31b) 
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with bittij + bjUji = bibj and bi = bj and the sG scheme reduces to 


(32a) 


qo 




1=1 


i=i 


1 * ^ 1 * 
1=1 * 1=1 


where biUij + bjUji = 0 and bi = bj. Remember that the coefficients aij are different for the two 
schemes (311 and (32) (see To approximate the cost functional C{q(t),p{t),u(t)) dt in 

(Sal we employ the same quadrature rule that we use to approximate the action on [0, h] {cf. ( |17[ ) 
and (251) such that the discrete density cost function Cd is defined by 

s ph 

Cd{{Q^,P^,U,}U)■■=hy2b,C{Q,,P,,U,)^ / C{qit),pit),uit))dt. 

^=l -bo 

So as to prevent a proliferation of symbols and alleviate the notation, along a time step interval 
[tk,tk+i], we write q^, p^ and instead of {gfe, {Qf }f=i, Qfc+i}, {pkAPi}Ui^Pk+i} and 
respectively. We drop the superscript k if we consider an arbitrary time step interval [0,/i]. 
With some abuse, along the whole interval [0,T], we equally write qh, Ph and Uh instead of 

{{Pk,Pt}i:tf-\PN} and respectively. 

With this notation we define the discrete cost function as 

N-l 

Jd{qh,Ph,Uh) ■■= Cd{ql,Ph,ul)+ ^[qN,PN) 

k=0 

and introduce the following two discrete optimal control problems, where the discretization in 


Problem |3. 3 1 is based on the spRK integrator and in Problem 3T on the sG integrator. 
Problem 3.3 (Discrete optimal control problem: the spRK case). 

(33a) min Jdiqh,Ph,Uh) 

qh,Ph,Uh 

subject to 

s s 

(33b) qk+i =qk + hY bjfiQj, Pj), Pk+1 =Pk+hY haiQj, Pj,u^) , 


i=i 


i=i 


(33c) 


=qk + hY<^^dfiQJ^PJ)^ 

i=i 


P^ =Pk+hYbn9{Q^j.PlU^) 

i=i 


/c = 0,..., — 1, i = 1,..., s, with bittij + bjttji = bibj and bi = bj, 

(33d) {qo,po) = {q°,p°), Ut&U. 

Problem 3.4 (Discrete optimal control problem: the sG case). 


(34a) 

subject to 
(34b) 


min Jd(qh,Ph,Uh) 

qh,Ph,Uh 


qk = 


qk+i=Yl^'Q"’ 


i=i 


i=i 


(34c) fiQt Pt) =IY ’ 9iQt Pt,Ut) ^ a^^pk ^ 


i=i 

k = Q,..., N — 1, i = , s, with biOij + bjCLji = 0 and bi = bj, 

(34d) {qo,Po) = {q°,P°), Uf&u. 


3 = 1 


Since Problem 3.3 has been extensively studied in (39] (as discussed in Section 3.4), in this work 
we focus on Problem (TD 
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3.4. Comparison of different solution methods. In Figure[^(see also [SS]) we present schemat¬ 
ically different discretization strategies for optimal control problems. Starting with the Lagrangian 
optimal control problem 2.1 we obtain via variation (for the derivation of the Euler-Lagrange 
equations) the optimal control problem 2.2 For its solution, direct or indirect methods can be 
employed (the differences of direct and indirect methods are already discussed in Section [^. 

In the DMOC approach, rather than discretizing the differential equations arising from the 
Lagrange-d’Alembert principle, we discretize in the earliest stage, namely already at the level of 
the variational principle. Then, we perform the variation only on the discrete level which results 
in a nonlinear programming problem (in particular we obtain the discrete Lagrangian optimal 
control problem 3.21 . Its necessary optimality conditions are derived by a dualization step as 
for a standard direct method. This approach that uses the concept of discrete mechanics leads 
to a special discretization of the system equations based on variational integrators. Thus, the 
discrete optimal control problem inherits special properties exhibited by variational integrators as 
extensively discussed in |39| . 



indirect 


direct 


DMOC 


Figure 1. Optimal control for mechanical systems: the order of variation, dualization and 
discretization for deriving the necessary optimality conditions. 


In this work we are interested in the question under which conditions the discretized necessary 
optimality conditions (discrete PMP) and the KKT conditions resulting from the discrete optimal 
control problems 


( |3.3[ ) and (3.41 (KKT) are identical. To this end, we summarize by now already 
known equivalence relations (A), (B) and (C) as indicated in Figure 

Equivalence (A). This equivalence corresponds to the commutation of variation and discretization. 
For particular variational integrators their equivalence to other well-known integration methods 
has been shown, e.g. the equivalence to the Stormer-Verlet method (|31]), to the Newmark al¬ 
gorithm m) or, more generally, to spRK methods (im I3S1 HU) applied to the corresponding 
Hamiltonian system. Whereas the variational derivation of spRK methods is an already known 
fact and was presented in Section |3.2.1| from a slightly different point of view, we found in Sec¬ 
tion |3]2]^ a new class of integrators, the sG methods, that applied to the Hamiltonian systems 
are equivalent to variational integrators based on the discrete Lagrangian (261. Obviously, if the 
variation of the discrete Lagrange-d’Alembert principle (the variational integrator) and the dis¬ 
cretization of the Euler-Lagrange or, equivalently, the Hamiltonian equations is the same, then, 
using the same discretization for the cost functional provides the same NLP in the middle and the 
right branch in Figure 
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Equivalence (B). Of course, if NLP and NLP are identical (equivalence (A)), then also the KKT 
and the KKT conditions are identical. 

Equivalence (C). This equivalence corresponds to the commutation of discretization and dualiza- 
tion. As already summarized in Section for the class of Legendre pseudospectral methods the 
commutation property has been proved if additional closure conditions are satisfied (see [THll48| L 
In |39| the commutation of discretization and dualization is proved for spRK methods, that means, 
the dualization of Problem |3.3| is the same as the discretization of the necessary optimality condi¬ 
tions using a spRK method. Due to equivalence (A), this commutation property also holds for high 


order variational integrators based on the discrete Lagrangian (18l. However, for general classes 


of variational integrators (in particular there exist high order variational integrators that are not 
equivalent to RK methods, see e.g. |38]), the commutation property is still an open question. As 
new contribution in this paper, we show in Section that discretization and dualization also com¬ 
mute for the sG integrator (see Theorem |5.2[ ) (i. e. the dualization of Problem |3.4| is the same as 
the discretization of the necessary optimality conditions using an sG method). We therefore find 
(besides the spRK methods) another class of variational integrators that fulfills the commutation 
property. 


4. Analysis of convergence of the primal variables 


In this section, we present one of the main results of the paper, that is the convergence of the 
primal variables. Before that, a couple of comments are necessary to clarify the assumptions and 
notation that appear in the statement. Examples will enlighten the result. 

We say that a function f: H —>■ M. is coercive, where H is a Hilbert space with norm || • || (in 
our case of study H is either K™ or L^([0, T], M™)), if there exists a scalar factor a > 0 for which 


lim inf 

IPII-J-OO 


fix) 

||tP 


> a. 


If / depends on a further variable, / = f{x,y), we say that / is uniformly coercive in x (with 
respect to y) if the coercivity factor a does not depend on y. 

In the next result, a discrete trajectory qh, either over a time step interval [0, h] or over the whole 
time interval [0,T], should be understood as a continuous trajectory along [0,/i] or [0,T], respec¬ 
tively. To that, on [0, h], qh = {^O; {Q^}i=l^ 9i} can be also viewed as its own linear interpolation, 
that is as the piecewise-linear continuous function qh'. [0, h] —K” determined by 


QhiO) = qo , qh{i ■ h/{s + 1)) =Qi, i = l,...,s, quik) = qi 


and linear in between. One proceeds similarly on [0,r] and as well for ph and Uh- 


Theorem 4.1. Given a Lagrangian function L: TQ —>■ K, an external control force F: TQ xU —>■ 
T*Q, a density cost function C: T*QxU —>■ K and a set of collocation points 0 < ci < ... < < 1 

defining a quadrature rule let us assume that 

(HI) L is regular; 

(H2) F is affine on the controls, i.e. F{q, q, u) = Fo{q, q) + u ■ Fi{q, q); 

(H3) C is uniformly coercive in u and smooth in iq,p); 

(H4) {OCP), the continuous Problem \2.f^ has a unique solution {q,p,u); 

(H5) bi > 0 for i = 1,..., s; and 

(H6) the associated spRK or sG scheme is convergent (for L, F and any fixed u). 

Then {qh,Ph,Uh) converges (up to subsequence) to {q,p,u) as h —>■ 0 (N oo), strongly in {q,p) 
and weakly in u, where (qh,Ph,Uh) is the solution to {OCP)h, the corresponding discrete Problem 
or\‘J.4\ 


Proof. The assumption of coercivity H3 on the density cost function C (recall T* is assumed to 
be bounded from below) implies the coercivity of the total cost functional J and, together with 
the assumption of positiveness |H5| on the weight coefficients bi, ensures also the coercivity of the 
discrete density cost Cd and of the discrete total cost Jd'. Indeed, from the definition and for Uh 
big enough. 


Cdiqh,Ph,Uh',h) 


> hi 


and 


Jd{qh,Ph,Uh',h) 


> T min 6, 


Uh\ 


Uh\ 
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where (qh,Ph,Uh) represents a one step trajectory on the left while a full trajectory on the right. 
The coercivity of J follows similarly by integration. We stress the fact that Jd is uniformly coercive 
with respect to the time step h. 

Besides we have from |Hl| that, for h small enough, the discrete Lagrangian is regular. There¬ 
fore for each control set Uh, a unique solution {qh,Ph) to the discrete Euler-Lagrange equations 
exists. The minimization in {OCP)h is then nothing but a finite dimensional minimization prob¬ 
lem whose solution existence is provided by the coercivity of Cd- We denote {qh,Ph,Uh) such 
solution. 

Since the discrete cost Jd is uniformly coercive with respect to h, the sequence (u/i), thought in 
L^([0, T], U), is bounded. Hence, up to subsequence, it converges to some control u G T^([0, T], U), 
for the weak topology of L^. It only remains to show that {q,p,u), where {q,p) is the unique 
solution of the mechanical system corresponding to the control u (hypothesis |H1[ ), is in fact the 
same point as {q,p,u). 

Firstly, by |H2| the controls enter the continuous and discrete dynamical equations (5b), (23aI, 


(23b I, (29a I and (29b I linearly. It follows that {qh,Ph) converges uniformly to (g, p) (more technical 
details on this standard reasoning may be found in |49|1. Secondly, we observe that the control 
u is optimal: Indeed, since the discrete cost Jd is an approximation to the continuous one J, we 
have that for some exponent r > I 

J{qh,Ph,Uh) = Jd{qh,Ph,Uh\h) + OQf) 

< Jd{{q,P,u)h-,h) + 0{h'") 

= J{q,p,u) + o{h^), 

where {q,p,u)h, for each h, represents simply the evaluation of {q,p,u) at the collocation points 
of each time interval. Passing to the limit, from the lower semi-continuity of J with respect to u 


(given by the integral expression and hypothesis H3l, it follows that 


J{q,P,u) < J{q,p,u) . 

Hence {q,p,u) is optimal and by uniqueness coincides with (q,p,u). 


□ 


Remark 4.2. Even though the rather “classical” definition of coercivity used here is perhaps less 
restrictive than the one given in [niiiH], there is no direct relation between them in the sense 
that none of them implies or is implied by the other. They have non-empty intersection. In there, 
the coercivity is a classical sufficient second-order assumption ensuring the absence of conjugate 
points. In here, the coercivity implies the existence of a (global) optimal control. The framework 


that 4.1 proposes has some advantages: It permits to prove without difficulty the existence of 
solutions for the discretized problem and it ensures the convergence from a simple topological 
argument. Moreover, the proof itself has the potential of being more general, for instance to 
consider final constraints (by means of finer arguments, the concept of end-point mapping, the 
general conjugate point theory). 

Remark 4.3. It is worth to note that the previous proof withstands some easy generalizations. If 
we now take more general dynamics (nonlinear force in u) and costs, then the above reasoning 
works as well provided the cost is coercive in some and the dynamics satisfy, for instance, 

'|4'(g,p, u)" 


lim sup ■ 


= 0, p>r, 


where 'I' = 0 stands for the dynamical constraints (5b I. 


Remark 4.4. A convergence proof (including consistency and stability) for general variational 
integrators (as assumed in H 61 is topic of ongoing research. For particular classes, the conver¬ 
gence is proven by showing that the variational integrator is equivalent to another well-known 
convergent method, as for example for symplectic partitioned Runge-Kutta methods. For a recent 
convergence analysis for Galerkin variational integrators by means of variational error analysis 
we refer to j^HI- The assumption H4 the uniqueness of the solution of (OCP), is a classical 


one. It can be weakened by stating the result in terms of closure points as follows. We assume 
that C^{[0,T],T*Q) is endowed with the uniform convergence topology and that L^([0,T],C/) is 
endowed with the weak topology. Then every closure point of the family of solutions (qf^PhyUh) 
of iOCP)h in C°([0,T],T*Q) x L^{[0,T],U) is a solution of (OCP). 
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Remark 4.5. In the formulation of the previous discrete optimal control problems |3.3| and |3.4[ 
we have chosen to discretize the control parameter and the cost functional in accordance to the 
dynamical discretization. Other possibilities are available which must be pondered. Let’s assume 
temporarily that, besides the original set of collocation points 0 < Ci < ... < Cg < 1, we have a 
couple of extra sets of them: 0 < di <...< dr < 1 and 0 < ei < ... < Ct < 1, for which U is 
determined by the former and J is discretized by the quadrature rule associated to the latter. That 
is, U'. [0, h] —>■ K is a polynomial of order r — 1 determined by r points Ui = U{dih), i = 1,..., r, 
and for which Uj := U{cjh), j = 1,..., s, and tJk ■= U{ekh), k = 1,..., t, are mere evaluations. 
And the cost function J is discretized by the formula h J2k=o Si=i hCiQi, Pi , Ui), where with 
a similar notation the “hat” stands for weights and evaluations related to the e’s. Now, different 
cases arise: 

• If r > t, one does an underevaluation of the controls within the discrete cost functional 
which will allow these to diverge (for instance a control could not appear explicitly in the 
discrete cost and therefore take any arbitrary value). 

• If on the contrary r < t, one does an overevaluation of the controls which, in the case of 
a coercive discrete cost function, will only increase the computational cost with, a priori, 
no convergence benefits. 

• Therefore, the case r = t seems to be the best fit, which is the case where there is a 
minimal number of evaluations of the controls (each control is evaluated just once in the 
discrete cost) so to have convergence under the assumption of coercivity. 

Assuming the last situation and continuing with the discussion, further cases arise: 

• On the one hand, if r > s, the convergence rate of the controls will be limited by the 
accuracy of the discrete dynamics. 

• On the other hand, if r < s, the convergence rate of the controls will suffer from a 
computational payload by the high accuracy of the dynamics. 

• Therefore, the case r = s seems again to be the best fit, which is the case that equates 
accuracy of the dynamics with convergence rate of the controls. 

Finally, under the assumption r = s = t, choosing a unique set of collocation points 0 < ci < 

... < Cg < 1 (we drop the “hats”, “bars”, e’s and d’s), one minimizes the number of polynomial 
evaluations and therefore the total computational cost (besides of simplifying the problem). 


In the following example, we solve a simple optimal control problem with a linear dynamical 
constraint and a quadratic cost function. The numerical experiments show, in the spirit of HZ] and 
the previous discussion before it, how a good choice and proper combination of the discretization 
gives a “fast” convergence of the scheme, while other combinations show “slow” convergence or 
even divergence of the controls, all of it exemplifying Theorem |4.1| 

Example 4.6. Consider the problem 


(35a) 

(35b) 

for which the functions 


min / (g 
9.9.“ Jo 

s.t. 


^)dt 


qit) = 


q = l + u, 

cosh(t) — 1 
cosh(T) 


(g(0),g(0)) = (0,0) 


and 


L{t) = 


cosh(t) 

cosh(T) 


- 1 


are the unique solution. We identify from the forced Euler-Lagrange equation (35b I the Lagrangian 
function L{q,q) = ^q^ + q and the control force F{q,q,u) = u. The density cost function is 
obviously C{q, q, u) = q^ . 

We discretize the mechanical system by using a symplectic Galerkin approach together with a 
Lobatto quadrature for s = 3 points. We initially assume that the controls are also discretized by 


r = 3 nodes. Then the right- hand 

side equations of ( 29b| or (34c 1 are 

(36a) 

Pi = 

+ 2 Q 2 ) = 1 -k C/i , 

(36b) 

P2 = 

k(~Qi + Qs) = 1 + C 2 , 

(36c) 

Ps = 

^ + ^(2Q2 — dQs) = ^ + U 3 , 
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where the micro-veloticies Qi are given by the 
particular case 


left equations of (29b I 


or (34c), which are in this 


/ Qi 

Q 2 

V Qs 



4 

0 

-4 



/ Qi 
Q 2 
\ Qs 


For the cost function, we consider four different discretizations with Lobatto’s quadrature rule 
for t = 1, 2, 3,4 quadrature points. These are respectively 


(37a) 

Cdiqh,Ph,Uh) = 

^ (qI + ^^ 2 ) I 

(37b) 

Cdiqh,Ph,Uh) = 

- (qI + qI + , 

(37c) 

Cdiqh,Ph,Uh) = 

\ {qI + ^Ql + Ql + Ul + 4C/| + c/|) , 

(37d) 

Cdiqh,Ph,Uh) = 

— ^(3 — -s/S) (Qi + (3 — V5)Q2 + Q 3 + 


+(1 ~ l/V^) ^(Qi + Q2)^ + (Q 2 + 

+ ^ (2(17i + U2f + (C/i - Uaf + 2(172 + C/ 3 )" 


+Ul + 12C/| + Ui) . 


The first two discretizations. Equations (37a) and (37b), are clearly not coercive with respect to 


the controls (Ui and U 3 are missing in (37a) and U 2 is missing in (37b)), which will be allowed 


to diverge (see Figures 2a and 2b). Nonetheless the last two discretizations. Equations (37c) and 


(37d), are indeed coercive, still one outperforms the other in terms of convergence (see Figures 2c 


and 2d). The discrete cost function (37d), besides of having a higher computational cost, shows a 


slower convergence rate. The discrete cost function (37c) corresponds to the method presented in 


Problem (34) and Theorem 4.1 


We continue by assuming that the approximated control U is determined only by two points, 
that is 

U{t) = Ui+t{U3-Ui) 

(we note t /3 instead of U 2 to make the notation more appealing). The previous set of Equations 
(36) and (37) are then updated by merely substituting the controls by 


Ui=Ui, U 2 = UUi + c/3) , and C /3 = C /3 , 


which leads to 


(38a) 

Cd{qh,Ph,Uh) 

— h (^Q 2 + {Ui + 1 / 3 )^/4^ , 

(38b) 

Cdiqh,Ph,Uh) 

= 2 + Q 3 + + ^ 3 ) 1 

(38c) 

Cd(qh,Ph,Uh) 

= ^{Ql + ^Ql + Ql + ui + iu, + U3)^ + ui) , 

(38d) 

Cd{qh,Ph,Uh) 

= — ^(3 — \fb) (^QI + (3 — V5)Q2 + Q 3 +) 

+(1 ~ l/v^) ^(Qi + < 52 )^ + {Q 2 + Qs)^^^ 
+ ^ (ui + [U, + U3f + ul) . 


In this occasion, only the first discretization. Equation (38a), defines a non-coercive discrete cost 
function (see Figure 3a). From the rest (see Figures 3b|3d ), Equations (38c) and (38d) show 
the fastest convergence rate, although slow on the controls and with a computational payload for 
(38d). Equation (38c) corresponds to a discretization of the cost with three quadrature points. 
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(b) cost function (37bl 

1 -error In q, slope=3.6507 1 
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2 
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(c) cost function (|37c[) 
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h 

(d) cost function (|37d[) 


Figure 2. Convergence behavior of the discrete solution for the discrete cost 


functions given in (371. 


- exact q 

-computed q 

- exact u 

• ■ computed u 


5 6 7 8 9 10 


(a) cost function (|38a| 


(b) cost function (|38b| 


I-error In q, slope=3.57331 ■ 
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(c) cost function (38c I 


(d) cost function (38dl 


Figure 3. Convergence behavior of the discrete solution for the discrete cost 


functions given in (381. 
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5. Commutation of discretization and dualization 


In this section we investigate the equivalence (C) in Figure for the special choice of sG 
discretization of the optimal control problem |2.2| To this end, we analyze and compare the 
adjoint systems for the continuous and the discrete optimal control problems. 

Throughout the section we assume that all controls under consideration do not saturate the 
constraints. In other words, we assume that the optimal control is in the interior of the set of 
constraints on controls. This is obviously the case if ?7 = K™. This assumption allows us to avoid 
the typical situation of bang-bang controls, and under slight extra conditions, to derive (from 
the Pontryagin Maximum Principle) extremal controls that are smooth functions of the state and 
costate. The necessary optimality conditions (10 1 are 


(39a) 

A = -V,C-A-VJ-i/.-V,g, 

A(T)=V,$(g(T),p(r)), 

(39b) 

Ip = -VpC - A • Vp/ - Ip ■ Vpg , 

V^(T) = v,$(g(T),p(r)), 

(39c) 

0 = VuC + Ip ■ Vu9 ■ 



If we use the sG integrator for the discretization of Problem |2.2[ we obtain the discretized optimal 
control problem |3.4[ To derive the necessary optimality conditions for the discretized optimal con¬ 
trol problem, we introduce the discrete adjoint vectors (covectors in K") Aq, ..., Xn, • i 
i/jQ, A°,..., ..., i = 1,..., s, and define the discrete optimal control Lagrangian 

as 


N-l s 


Cd = EE hb,Ci + $((7jv,P7v) - Ao • (go - 9°) - V'o • (po - P°) 


i—l 

N-l 

+ E 

/c=0 


Mfe • 1- E - E 

I V i=i 


+ ^A^ 


i=i 


(40) 


+ • hgf - 


k /3>/c+i - 




E “b-Pj 


i=i 


where Cf is a short notation for C{Q^,P^^Uf) (analogously for and gf). The necessary 
optimality conditions (KKT equations) are derived by differentiation w.r.t. the discrete variables 
qk,Pk, k = 0,..., N and P^,U^, k = 0,..., N — 1, i = 1,..., s, which leads to 


(41a) 

(41b) 

(41c) 

(41d) 

(41e) 

(41f) 

(41g) 

(41h) 


for fc = 0 , ..., iV - 1 : /ij, - Afc = 0 , 

Vg4)(gAr,pAr) — Aat = 0, 

-V'o + E ^4/0 = 0, 

forfc = l,...,7V-l: E F" E = «> 

^ - aVfc + /3*A,+i - ^ + Af • = 0, 

E + N ■ ^pft + 'f'f ■ Vpgf = 0, 

i=i 

6,V„C'f + vhf • = 0, 
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with k = 0,..., N — l,i = for the last three equations. We transform the necessary 

optimality conditions by defining 

(42a) T^-=A^/h and for k = 0,..., N - 1, i = 1,..., s, 

S S 

(42b) fc = 0,...,lV-l, and fc = 1,..., iV, 


2=1 


2 = 1 


such that Equation (41dl reduces to ip'j^ = := ijjk- By eliminating the variables fJ,o, ■ ■ ■ 

with Equation (41al and by exploiting the conditions on the coefficients 5^0^ + bjUji = 0 and 
bi = bi, we obtain the following discrete adjoint system 


(43a) 

(43b) 

(43c) 

(43d) 

(43e) 




j 




-v,cf - rf • v,/f - xt ■ v,<7f = 


/3*Afe+i — a*Afe 


hbi 




i=i 


-VpCf - . Vp/ - X? • =l E 


i=i 


V.C'f + x."-V„5f =0, 
for fc = 0,... , iV — 1, i = 1, ... , s, and with final conditions 
(43f) Atv = Vg4>(gAr,pAr) and i>N = '^p^{qN,PN) ■ 

where 

(43g) bittij + bjttji = 0 and bi = bj . 


Note that the adjoint scheme (43al-(43dl together with the final constraints (43fl and the condi¬ 
tions on the coefficients (43gl is exactly the symplectic Galerkin integrator applied to the adjoint 
system (39al-(39b). To ensure that the discrete adjoint system (431 is indeed equivalent to the 
necessary optimality conditions defined in (411 we show the following proposition. 


Proposition 5.1. If bi > 0 for each i, then the necessary optimality conditions (411 and the dis¬ 
crete adjoint system (431 are equivalent. That is, i/(/io,...,/iA-i, A°,..., T?,..., 41^“^), 

i = 1,..., s, satisfy (411, then (431 hold for {'ifk, hj’, xf) defined in (42a I and (42b I. Conversely, if 
[ifi ,... ,'i/;Ar,r°,... ,r, "\x?, ■ ■ i = l,...,s, satisfy (|4^, then (j^) hold for {^Xk, ,^^1) 

defined in (41a I and (42al. 


Proof. We already derived the adjoint system (431 starting from the necessary optimality condi- 


tions (411. We now suppose that (^i,..., i/jn, ...,. 


: Xi • 1 Xi 


), i = 1,..., s, satisfy 


the adjoint system (43l. Equation (41a I holds by assumption. The condition for Xn in (43fl and 
(41b I are identical. The condition for ipN in (43f I together with Equation (43b I for k = N — 1 and 
the definition (42a) yields (41el whereas Equation (43a) for fc = 0 together with definition (42a) 
yields (41c). By subtracting Equations (43a) and (43b) for the same index k and using defini¬ 
tion (42a) we obtain (41d). Finally, by taking the condition (43g) on the coefficients into account, 
(43c)-(43e) and definition (42a) yield (41f )-( 4lh| , respectively. □ 


With the classical Legendre assumption, i.e. {d^'H/du^){q* ,p*, u*, A, ip, 1) is a positive definite 
symmetric matrix, with Equation (39c) u can be expressed as function of the states and the 
adjoints, u = u{q,p, A, ip). We denote by v and ly the functions defined by 


v{q,P,X,'p) = {-AIqC{q,p,u) - A • '^qf{q,p) - tp ■ qg{q,p,u)) \u=uiq,p,\,^) , 
'n{q,P,\'ip) = {-^pC{q,p,u) - A • Vpf{q,p) - Ip ■ Wpg{q,p,u)) \u=u{q,p,x,^) ■ 
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With some abuse of notation, let g{q,p, X,ip) denote the function g[q,p^u{q,p,\,ip)). In the case 
where the control has the form Uf = u{Q^,P^,r^,Xi), the state and adjoint scheme based on 
the symplectic Galerkin integrator can be expressed as 


(44a) 

S 

qk = , 

s s s 

il,u=Y.a^xh iPk+i=Y.^'x) 

i=i i=i f=i 

(44b) 

i=i 

k P"Pk+i - a"Pk 1 'A fc 

“ hh 

(44c) 

i=i 

k - Q!*Afe 1 p: pfc 

hh 

k = 0,.. 

.,N-l,i = l,...,s, 


(44d) 

0 

II 

0 

0 

= P°, Atv = Vq<i)(gAr,pAr), ipN = X/p^{qN,PN), 


where haij + bjUji = 0 and bi = bj and where and gb are short notations for fiQi^Pi) and 
(analogously for 77 * and r-f). 

Scheme ( [44| can be viewed as symplectic Galerkin discretization of the two-point boundary 
value problem 


(45a) 

q = f{q,p), 

(45b) 

p = 9{q,p,\'f’), 

(45c) 

A = v{q,p,\,if ), 

(45d) 

^ = v{q,p,><,'^), 


q{ 0 ) = q° , 

p{0)=p°, 

A(T) = v,$(g(r),p(r)), 
7/-(T) = v,$((z(r),p(r)), 


where the variables (g, 7 /;) and (p, A) are treated in the same way, respectively. Since the same 
discrete scheme is used for state and adjoint system, the orders of approximation coincide. This 
leads to the following statement. 


Theorem 5.2 (Gommutation property). Given the (OCP) \2.2\ besides of HI H3 H4 from 


Theorem 4-1 we assume that {d'^'H/du'^){q* ,p*, u*, A, if, 1) is a positive definite symmetrie matrix. 
If a convergent symplectic Galerkin method with bi > 0, i = 1,..., s, is used for the discretization 


of the state system, dualization and discretization commute, i.e. the dualization of Problem 3.4 
coincides with the sG discretization of the boundary value problem (45). 


Remark 5.3. In spirit of the Govector Mapping Principle (see |l()|h the order-preserving map 
between the adjoint variables corresponding to the dualized discrete problem (KKT) and the 
discretized dual problem (discrete PMP) is given by Equation (|42a[). 


Remark 5.4. If the controls do not saturate the constraints, i.e. control constraints are active, the 
optimal solution is typically only Lipschitz continuous. Then we expect analogously to m that 
convergence rates are limited to order two even for higher order approximation schemes. 


6. Conclusions 

In this work, we investigate the application of high order variational integrators to the numerical 
solution of optimal control problems of mechanical systems. We derive two different schemes of 
high order variational integrators, the spRK and the sG method, which are both used for the 
discretization of a Lagrangian optimal control problem. The convergence of the primal variables 
of the resulting discrete optimal control problems is proven. Furthermore, the commutation of 
dualization and discretization for the sG method is shown, which extends the result in |39j to 
another class of variational integrators that fulfills this commutation property and directly implies 
that the Covector Mapping Principle is satisfied. In particular, due to the commutation, not 
only the order of the adjoint scheme but also the discretization method itself is preserved and 
in contrast to Legendre pseudospectral methods or classical Runge-Kutta methods, no additional 
closure conditions (see [16]) or conditions on the Runge Kutta coefficients (see [HI), respectively, 
are required. 
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The fulfillment of the Covector Mapping Principle provides a convenient way to prove the 
convergence of the dual variables (as done, for example, in CD)- With Theorem |5.2| the solution 
of the discretized direct problem coincides with the discrete solution of a shooting method applied 
to the necessary conditions of optimality. By showing the convergence of the shooting approach, 
we can conclude directly the convergence of the direct approach. The convergence proof for the 
adjoint variables is left for future work. 

In the present paper we restricted ourselves to optimal control problems without any constraint 
on the final state. If we consider more general optimal control problems, involving constraints on 
the final state, then we expect that we will have to use the general conjugate point theory (see 
0 ), in order to provide second-order conditions for optimality, related with the classical sensitivity 
analysis along a given optimal trajectory. Also, if there are some constraints on the final point then 
abnormal extremals may occur in the application of the Pontryagin Maximum Principle, which 
may raise a major problem in the analysis. Fortunately, it is known that abnormal minimizers do 
not exist under generic assumptions on the system and on the cost (see [9]), and we expect that 
the results presented in this paper may hold in such a generic context. Otherwise the possible 
presence of abnormal minimizers is responsible for a loss of compactness (in particular, adjoint 
vectors do not stay in a compact anymore, see |49|1. which may imply the failure of our method 
of proof, and it is not very clear then if we can expect that the Covector Mapping Principle hold 
true in that case. These general considerations will be investigated in future work. We stress 
again that, in the present paper, we have restricted ourselves to a more simple and tractable 
case. Once again, note that we have considered control-affine systems with (quasi)-quadratic 
costs, with assumptions ensuring the smoothness of optimal controls (as functions of the state and 
of the costate). As in [T7], our theory can be applied whenever there exist control constraints, 
provided the optimal controls under consideration belong to the interior of the set of constraints. 
Otherwise, in the general case controls may saturate the constraints, typically bang-bang controls 
do appear and then additional assumptions must be done in order to ensure a nice regularity 
of controls as functions of the state and of the costate. For instance, it is desirable to avoid 
chattering phenomena, in which a given optimal bang-bang control can have an infinite number 
of commutations over a compact interval. Also, we expect that one has to use the corresponding 
conjugate point theory in the bang-bang case. Such a theory does exist in the purely bang-bang 
case (see [51 [37]) but then requires additional assumptions (ruling out, in particular, chattering). 
It can be noted that a general conjugate point theory, involving all possible subarcs - free, bang, 
singular, boundary - is still to be done. In order to get a general Covector Mapping Principle, 
such a complete theory is certainly required. 

Another interesting issue is the application of more general symplectic and structure preserving 
methods in optimal control. Whereas the commutation property is shown for an already rich 
class of symplectic integrators (spRK and sG) it is still an open question, if it is satisfied for any 
variational and thus, any symplectic integrator (note that the classes of variational and symplectic 
integrators are identical, see e.g. m)- 
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